Visual Reasoning

Towards Scalable Web Accessibility Audit with MLLMs as Copilots

Nov 05, 2025
Via arXiv

Computational Imaging Meets LLMs: Zero-Shot IDH Mutation Prediction in Brain Gliomas

Nov 05, 2025
Via arXiv

A Multi-Modal Neuro-Symbolic Approach for Spatial Reasoning-Based Visual Grounding in Robotics

Oct 30, 2025
Via arXiv

Counteracting Matthew Effect in Self-Improvement of LVLMs through Head-Tail Re-balancing

Oct 30, 2025
Via arXiv

Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark

Oct 30, 2025
Via arXiv

ChartAB: A Benchmark for Chart Grounding & Dense Alignment

Oct 30, 2025
Via arXiv

Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning

Oct 31, 2025
Via arXiv

ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning

Oct 30, 2025
Via arXiv

Visual Backdoor Attacks on MLLM Embodied Decision Making via Contrastive Trigger Learning

Oct 31, 2025
Via arXiv

Unveiling Intrinsic Text Bias in Multimodal Large Language Models through Attention Key-Space Analysis

Oct 30, 2025
Via arXiv